In this case study, we will use ProjecTILs to interpret human scRNA-seq T cell data in the context of a murine TIL atlas. We are going to use the single-cell dataset generated by Sade-Feldman GSE120575 to illustrate interspecies mapping of human cells on a stable murine atlas using gene orthologs.

Background

In the study by Sade-Feldman et al. (2018) Cell, the authours characterize transcriptional profiles of immune cells from melanoma patients before and after immune checkpoint blockade, with the goal to identify factors that associate with success or failure of checkpoint therapy. They reported that certain CD8+ T cell states associate with clinical outcome, and in particular that the transcription factor TCF7 is a significant marker for response to checkpoint therapy. This suggests that the state of T cells found within a tumor is critical for induction of effective immune tumor rejection, highlighting the importance of characterizing (and potentially intervening on) the diversity of tumor-infiltrating T cells.

The single-cell expression data GSE120575 consists of CD45+ single cells from 48 tumor biopsies of melanoma patients before or after checkpoint inhibitors treatment, and sequenced using the Smart-seq2 protocol. Meta-data on response to therapy (Responder vs. Non-responder) is also available on the same GEO identifier.

Here we use the development version of ProjecTILs >0.4.1, which implements ortholog gene conversion between human and mouse.

R Environment

Check & load R packages

Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS="true")

if (!requireNamespace("renv")) 
  install.packages("renv")
library(renv)
renv::restore()

#Load development version of ProjecTILs
remotes::install_git("https://gitlab.unil.ch/carmona/ProjecTILs.git", ref = "v0.4.1")

library(ProjecTILs)
library(gridExtra)

scRNA-seq data preparation

Download the count matrix and metadata from Gene Expression Omnibus (GEO), and store as Seurat object.

cached.object <- "SadeFeldman.seurat.rds"

if(!file.exists(cached.object)){
  
  library(GEOquery)
  geo_acc <- "GSE120575"
  datadir <- "input/SadeFeldman"
  gse <- getGEO(geo_acc)
  
  series <- paste0(geo_acc, "_series_matrix.txt.gz")
  
  system(paste0("mkdir -p ", datadir))
  getGEOSuppFiles(geo_acc,baseDir=datadir)
  
  ##Load expression matrix and metadata
  exp.mat <- read.delim(sprintf("%s/%s/GSE120575_Sade_Feldman_melanoma_single_cells_TPM_GEO.txt.gz", datadir, geo_acc), header=F, sep="\t")
  genes <- exp.mat[c(-1,-2),1]
  cells <- as.vector(t(exp.mat[1,2:16292]))
  samples <- as.factor(t(exp.mat[2,2:16292]))
  
  exp.mat <- exp.mat[c(-1,-2), 2:16292]
  colnames(exp.mat) <- cells
  rownames(exp.mat) <- genes
  
  meta <- read.delim(sprintf("%s/%s/GSE120575_patient_ID_single_cells.txt.gz", datadir, geo_acc), header = T, sep = "\t", skip = 19, nrows = 16291)
  meta <- meta[,1:7]
  
  treat <- factor(ifelse(grepl("Post", samples),'Post','Pre'))
  response <- factor(meta$characteristics..response)
  therapy <- factor(meta$characteristics..therapy)
  
  ##Create Seurat object and add meta data
  query.object <- CreateSeuratObject(counts = exp.mat, project = "SadeFeldman", min.cells = 10, min.features = 50)
  rm(exp.mat)
  query.object@meta.data$Sample <- samples
  query.object@meta.data$Time <- treat
  query.object@meta.data$Response <- response
  query.object@meta.data$Therapy <- therapy
  
  saveRDS(query.object, file=cached.object)
} else {
  query.object <- readRDS(cached.object)
}

Some basic statistics - cells per group (Pre vs. Post, Therapy, Responder vs. Non-responder)

table(query.object$Time)

 Post   Pre 
10363  5928 
table(query.object$Therapy)

    anti-CTLA4 anti-CTLA4+PD1       anti-PD1 
           517           4121          11653 
table(query.object$Response)

Non-responder     Responder 
        10727          5564 
table(query.object$Sample)

                  Post_P1                 Post_P1_2                  Post_P10 
                      291                       353                       308 
      Post_P10_T_enriched                  Post_P11                  Post_P12 
                       88                       313                       323 
                 Post_P13       Post_P13_T_enriched                  Post_P14 
                      265                       184                       329 
                 Post_P15                  Post_P16       Post_P16_T_enriched 
                      304                       369                        94 
                 Post_P17 Post_P17_myeloid_enriched                  Post_P18 
                      238                        43                       357 
                 Post_P19 Post_P19_myeloid_enriched                   Post_P2 
                      328                        75                       301 
       Post_P2_T_enriched                  Post_P20                  Post_P21 
                       79                       358                       213 
                 Post_P22                  Post_P23                Post_P23_2 
                      260                       358                       362 
                 Post_P28                Post_P28_2                   Post_P3 
                      369                       372                       359 
                Post_P3_2                  Post_P30                   Post_P4 
                      358                       369                       421 
       Post_P4_T_enriched                   Post_P5                 Post_P5_2 
                      254                       292                       336 
                  Post_P6        Post_P6_T_enriched                   Post_P7 
                      308                        92                       231 
                  Post_P8        Post_P8_T_enriched                    Pre_P1 
                      327                        82                       229 
                  Pre_P12                   Pre_P15                    Pre_P2 
                      330                       304                       337 
                  Pre_P20                   Pre_P24                   Pre_P25 
                      323                       338                       371 
                  Pre_P26                   Pre_P27                   Pre_P28 
                      333                       343                       337 
                  Pre_P29                    Pre_P3                   Pre_P31 
                      163                       245                       351 
                  Pre_P33                   Pre_P35                    Pre_P4 
                      452                       238                       311 
                   Pre_P6                    Pre_P7                    Pre_P8 
                      288                       368                       267 

Select only baseline (Pre-treatment) samples

query.object <- subset(query.object, subset = Time == "Pre")
table(query.object$Sample)

                  Post_P1                 Post_P1_2                  Post_P10 
                        0                         0                         0 
      Post_P10_T_enriched                  Post_P11                  Post_P12 
                        0                         0                         0 
                 Post_P13       Post_P13_T_enriched                  Post_P14 
                        0                         0                         0 
                 Post_P15                  Post_P16       Post_P16_T_enriched 
                        0                         0                         0 
                 Post_P17 Post_P17_myeloid_enriched                  Post_P18 
                        0                         0                         0 
                 Post_P19 Post_P19_myeloid_enriched                   Post_P2 
                        0                         0                         0 
       Post_P2_T_enriched                  Post_P20                  Post_P21 
                        0                         0                         0 
                 Post_P22                  Post_P23                Post_P23_2 
                        0                         0                         0 
                 Post_P28                Post_P28_2                   Post_P3 
                        0                         0                         0 
                Post_P3_2                  Post_P30                   Post_P4 
                        0                         0                         0 
       Post_P4_T_enriched                   Post_P5                 Post_P5_2 
                        0                         0                         0 
                  Post_P6        Post_P6_T_enriched                   Post_P7 
                        0                         0                         0 
                  Post_P8        Post_P8_T_enriched                    Pre_P1 
                        0                         0                       229 
                  Pre_P12                   Pre_P15                    Pre_P2 
                      330                       304                       337 
                  Pre_P20                   Pre_P24                   Pre_P25 
                      323                       338                       371 
                  Pre_P26                   Pre_P27                   Pre_P28 
                      333                       343                       337 
                  Pre_P29                    Pre_P3                   Pre_P31 
                      163                       245                       351 
                  Pre_P33                   Pre_P35                    Pre_P4 
                      452                       238                       311 
                   Pre_P6                    Pre_P7                    Pre_P8 
                      288                       368                       267 

ProjecTILs

Load reference TIL atlas - if it’s not present in the working directory, it will be downloaded from the repository

ref <- load.reference.map()
[1] "Loading Default Reference Atlas..."
[1] "/Users/admin/gitHub/ProjecTILs_CaseStudies/ref_TILAtlas_mouse_v1.rds"
[1] "Loaded Reference map ref_TILAtlas_mouse_v1"

Run ProjecTILs projection algorithm - version 0.4.1 supports human ortholog conversion (human.ortho=T)

query.projected <- make.projection(query.object, ref=ref, human.ortho = T)
[1] "Using assay RNA for query"
[1] "1685 out of 5928 ( 28 % ) non-pure T cells removed.  Use filter.cells=FALSE to avoid pre-filtering (NOT RECOMMENDED)"
[1] "Transforming expression matrix into space of mouse orthologs"
[1] "Aligning query to reference map for batch-correction..."

Projecting corrected query onto Reference PCA space
[1] "Projecting corrected query onto Reference UMAP space"

Plot global projection of human TIL data over the reference in UMAP space (non-T cell transcriptomes have been automatically removed)

plot.projection(ref, query.projected)

Predict the cell states in the query set

query.projected <- cellstate.predict(ref=ref, query=query.projected)
table(query.projected$functional.cluster)

     CD4_NaiveLike     CD8_EarlyActiv CD8_EffectorMemory      CD8_NaiveLike 
                82                413                789               1071 
           CD8_Tex           CD8_Tpex                Tfh                Th1 
               583                188                170                612 
              Treg 
               335 

Interestingly, expression of marker genes of projected human TILs correspond fairly well to those of the murine reference atlas: e.g. Pdcd1, Havcr2 and Entpd1 expression in terminally exhausted CD8 TILs (CD8_Tex), co-expression of Pdcd1, Tox and Tcf7 (at a modest level) in the CD8_Tpex state with a higher expression of Ifng; high expression of Tcf7, Ccr7 with low expression of Pdcd1 or cytotoxic molecules in the Naive-like states (that might include central memory cells); Cxcr5 and Tox coexpression in follicular helper T cells; Foxp3, Tox, Havcr2, Entpd1 in regulatory CD4 T cells, etc.

query.list <- SplitObject(query.projected, split.by = "Response")
plot.states.radar(ref, query=query.list,min.cells = 50, 
                  genes4radar = c("Foxp3","Cd4","Cd8a","Tcf7","Ccr7","Gzmb","Pdcd1","Havcr2","Tox","Entpd1","Cxcr5","Ifng"))

Note that projections and comparisons are performed in the ortholog space of murine genes - to check the names of human-mouse orthologs you can examine the conversion table for genes of interest:

data(Hs2Mm.convert.table)
which.genes <- c("TCF7","GZMB","CD8B","PDCD1")

Hs2Mm.convert.table[Hs2Mm.convert.table$Gene.HS %in% which.genes, ]
      Gene.stable.ID.HS Gene.HS Gene.MM                         Alt.symbol
7381    ENSG00000254126    CD8B   Cd8b1                              Cd8b1
7835    ENSG00000188389   PDCD1   Pdcd1                              Pdcd1
11096   ENSG00000081059    TCF7    Tcf7                               Tcf7
16688   ENSG00000100453    GZMB    Gzmb Gzmd,Gzme,Gzmf,Gzmc,Gzmn,Gzmg,Gzmb
      Alt.symbol.HS
7381     CD8B2,CD8B
7835          PDCD1
11096          TCF7
16688          GZMB

Response to therapy

Now to the interesting part. Let’s visualize the projection and TIL contexture of tumors that responded or not responded following immune checkpoint blockade:

query.list <- SplitObject(query.projected, split.by = "Response")

pll <- list()

pll[[1]] <- plot.projection(ref, query.list[["Responder"]]) + ggtitle("Responder")
pll[[2]] <- plot.statepred.composition(ref, query.list[["Responder"]], metric="Percent") + ggtitle("Responder")  + ylim(0,40)
pll[[3]] <- plot.projection(ref, query.list[["Non-responder"]]) + ggtitle("Non-responder")
pll[[4]] <- plot.statepred.composition(ref, query.list[["Non-responder"]], metric="Percent") + ggtitle("Non-responder") + ylim(0,40)

grid.arrange(grobs=pll, ncol=2, nrow=2, widths=c(1.5,1))

In non-responders, there is a clear enrichment in the terminally exhausted CD8 state (CD8_Tex), while responders are enriched in Naive-like states.

To better examine differences in TIL contexture between responders and non-responders, we can visualize the fold-change of T cell state frequency between the two groups:

which.types <- table(query.projected$functional.cluster)>20

stateColors_func <- c("#edbe2a","#A58AFF","#53B400","#F8766D","#00B6EB","#d1cfcc","#FF0000","#87f6a5","#e812dd")
states_all <- levels(factor(ref$functional.cluster))
names(stateColors_func) <- states_all
cols_use <- stateColors_func[names(which.types)][which.types]

#Responder vs non Responder
query.list <- SplitObject(query.projected, split.by = "Response")

norm.c <- table(query.list[["Non-responder"]]$functional.cluster)/sum(table(query.list[["Non-responder"]]$functional.cluster))
norm.q <- table(query.list[["Responder"]]$functional.cluster)/sum(table(query.list[["Responder"]]$functional.cluster))

foldchange <- norm.q[which.types]/norm.c[which.types]
foldchange <- sort(foldchange,decreasing = T)

tb.m <- melt(foldchange)
colnames(tb.m) <- c("Cell_state","Fold_change")
pll <- list()
ggplot(tb.m, aes(x=Cell_state, y=Fold_change, fill=Cell_state)) + geom_bar(stat="identity") +
  scale_fill_manual(values=cols_use) + geom_hline(yintercept = 1) + scale_y_continuous(trans='log2') +
  theme(axis.text.x=element_blank(), legend.position="left") + ggtitle("Responder vs. Non-responder") 

CD4 and CD8 Naive-like states are the most enriched in responders compared to non-responders, while terminally exhausted Entpd1+ Havcr2+ CD8 TILs are the most under-represented, confirming the observation of the original paper that a higher frequency of TCF7+ CD8 TILs is associated with response to immune checkpoint therapy.

Conclusions

Taking advantage of the ortholog mapping functionality of ProjecTILs, we have illustrated how to effortlessly analyze human scRNA-seq data in the context of a reference murine TIL atlas. Gene expression profiles confirmed that T cells are accurately projected in major CD4+ and CD8+ categories, as well as in more specific subtypes (CD8_Tex, CD8_Tpex, naive-like, Follicular helper, Th1, and regulatory CD4 cells). Comparing the baseline transcriptomics profile of TILs from melanoma patients responding and not responding to checkpoint inhibitors, and their composition in terms of cell states, confirms the observation of the original study that TCF7 correlates with checkpoint therapy responsivenss.

However, the enriched TCF7+ CD8 TIL population corresponds to a naive-like/central memory-like state that does not resembles the (Pdcd1+ Tox+) Precursor exhausted state well described in murine cancer and chronic infections. We also observed that non-responding biopsies are enriched in regulatory CD4 T cells and deplteted from helper T cells, which might also contribute to the reduced anti-tumor response following immunotherapy. ProjecTILs analysis in the context of a stable atlas gave a more complete picture of the T cell states that are associated with immunotherapy response.

Further reading

Original publication - Sade-Feldman et al. (2018) Cell

ProjecTILs case studies - INDEX - Repository